Wrapper Verification
Author
Abstract
Many Internet information-management applications (e.g., information integration systems) require a library of wrappers: specialized information extraction procedures that translate a source's native format into a structured representation suitable for further application-specific processing. Maintaining wrappers is tedious and error-prone, because the formatting regularities on which wrappers rely change frequently on the decentralized and dynamic Internet. The wrapper verification problem is to determine whether a wrapper is operating correctly. Standard regression testing approaches are inappropriate, because both the formatting regularities on which wrappers rely and the source's underlying content may change. We introduce RAPTURE, a fully implemented, domain-independent wrapper verification algorithm. RAPTURE computes a probabilistic similarity measure between a wrapper's expected and observed output, where similarity is defined in terms of simple numeric features (e.g., the length, or the fraction of punctuation characters) of the extracted strings. Experiments with numerous actual Internet sources demonstrate that RAPTURE performs substantially better than standard regression testing. Nicholas Kushmerick, Wrapper verification.
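The abstract describes the similarity measure only at a high level, so the following Python sketch is an illustration of the general idea rather than RAPTURE itself. It assumes each feature is roughly normally distributed and that features are independent; the function names (`features`, `fit`, `similarity`) and the variance floor are hypothetical choices made for this example.

```python
import math
import string
from statistics import mean, stdev


def features(s):
    """Simple numeric features of one extracted string."""
    n = len(s)
    punct = sum(ch in string.punctuation for ch in s)
    return {"length": float(n), "punct_frac": punct / n if n else 0.0}


def fit(training_strings):
    """Estimate (mean, std) of each feature from output known to be correct.

    Needs at least two training strings, because stdev() requires two points.
    """
    rows = [features(s) for s in training_strings]
    model = {}
    for name in rows[0]:
        values = [r[name] for r in rows]
        # Assumption: floor the std so a constant feature does not divide by zero.
        model[name] = (mean(values), max(stdev(values), 1e-3))
    return model


def similarity(model, observed_strings):
    """Score in [0, 1]; low values suggest the wrapper output has changed.

    For each feature, take the two-sided normal tail probability of the
    observed mean, then multiply across features (independence assumed).
    """
    rows = [features(s) for s in observed_strings]
    score = 1.0
    for name, (mu, sigma) in model.items():
        observed = mean(r[name] for r in rows)
        z = abs(observed - mu) / sigma
        score *= math.erfc(z / math.sqrt(2))  # P(|Z| >= z) for standard normal Z
    return score


# Hypothetical data: product names extracted while the wrapper was known to work.
training = ["Acme Widget", "Blue Gadget", "Red Sprocket", "Green Gizmo"]
model = fit(training)

plausible = ["Pink Widget", "Grey Gadget"]   # new content, same kind of string
broken = ["<td><b>", "</b></td><td>"]        # markup leaking through after a layout change

print(similarity(model, plausible))
print(similarity(model, broken))
```

Because the score depends on string statistics rather than exact values, new but well-formed content scores high, while markup fragments score near zero. A verifier built this way would still need a threshold chosen on held-out data.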
Similar resources
Developing a Filter-Wrapper Feature Selection Method and its Application in Dimension Reduction of Gen Expression
Nowadays, the growing volume of data and number of attributes in datasets has reduced the accuracy of learning algorithms and increased their computational complexity. Feature selection is a dimensionality reduction method, carried out through filtering and wrapping. Wrapper methods are more accurate than filter methods, but they run more slowly and carry a heavier computational burden. With ...
Fuzzy-rough Information Gain Ratio Approach to Filter-wrapper Feature Selection
Feature selection for various applications has been carried out for many years in many different research areas. However, there is a trade-off between finding feature subsets with minimum length and increasing the classification accuracy. In this paper, a filter-wrapper feature selection approach based on fuzzy-rough gain ratio is proposed to tackle this problem. As a search strategy, a modifie...
STeP: Deductive-algorithmic Verification of Reactive and Real-time Systems
The Stanford Temporal Prover, STeP, combines deductive methods with algorithmic techniques to verify linear-time temporal logic specifications of reactive and real-time systems. STeP uses verification rules, verification diagrams, automatically generated invariants, model checking, and a collection of decision procedures to verify finite- and infinite-state systems. computer-aided formal verification o...
Ethernet Wrapper: Extension of the TCP Wrapper
One of the popular network security programs supporting host access control is 'TCP Wrapper' [13]. TCP Wrapper is a software-only system, and many computers connected to the Internet use it. However, TCP Wrapper performs IP-address-based access control, and an IP address is not a reliable basis for authenticating a host. In this paper, we point out two possible attacks against the TCP Wrappe...
Interface-based Specification and Verification of Concurrency Controllers
We present a modular approach to specification and verification of concurrency controllers by decoupling their behavior and interface specifications. The behavior specification of a concurrency controller defines how its shared variables change their values whereas the interface specification defines the order in which a client thread should call its methods. We show that the concurrency controllers ...